Smart Paper Notes

Decoupled Knowledge Distillation

Borui Zhao , Quan Cui , Renjie Song , Yiyu Qiu , Jiajun Liang

Categories: Computer Vision | Artificial Intelligence

2022-03-16

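State-of-the-art distillation methods are mainly based on distilling deep features from intermediate layers, while the significance of logit distillation has been greatly overlooked. To provide a new viewpoint for studying logit distillation, we reformulate the classical KD loss into two parts: target class knowledge distillation (TCKD) and non-target class knowledge distillation (NCKD). We empirically investigate and prove the effects of the two parts: TCKD transfers knowledge about the "difficulty" of training samples, while NCKD is the prominent reason why logit distillation works. More importantly, we reveal that the classical KD loss is a coupled formulation, which (1) suppresses the effectiveness of NCKD and (2) limits the flexibility to balance the two parts. To address these issues, we present Decoupled Knowledge Distillation (DKD), which enables TCKD and NCKD to play their roles more efficiently and flexibly. Compared with complex feature-based methods, our DKD achieves comparable or even better results and has better training efficiency on the CIFAR-100, ImageNet, and MS-COCO datasets for image classification and object detection tasks. This paper proves the great potential of logit distillation, and we hope it will be helpful for future research. The code is available at https://github.com/megvii-research/mdistiller.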
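
The split of the classical KD loss into a target-class part and a non-target-class part can be checked numerically. The sketch below is a minimal illustration in plain Python, not the code from the mdistiller repository. The function names, the `alpha` and `beta` weights, and the toy logits are assumptions made for this example, and the temperature-squared scaling often applied to KD losses is omitted. It shows that the KL divergence between teacher and student equals TCKD plus NCKD weighted by the teacher's non-target probability mass, which is the coupling the abstract refers to.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) in nats for two discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def decompose_kd(teacher_logits, student_logits, target, temperature=1.0):
    """Return (kd, tckd, nckd, coupling_weight) for a single sample."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    pt_teacher, pt_student = p_teacher[target], p_student[target]

    # TCKD: KL between the binary (target vs. all other classes) distributions.
    tckd = kl_divergence([pt_teacher, 1 - pt_teacher],
                         [pt_student, 1 - pt_student])

    # NCKD: KL between the distributions renormalized over non-target classes.
    rest_teacher = [p / (1 - pt_teacher)
                    for i, p in enumerate(p_teacher) if i != target]
    rest_student = [p / (1 - pt_student)
                    for i, p in enumerate(p_student) if i != target]
    nckd = kl_divergence(rest_teacher, rest_student)

    kd = kl_divergence(p_teacher, p_student)
    return kd, tckd, nckd, 1 - pt_teacher


def dkd_loss(tckd, nckd, alpha, beta):
    """Decoupled form: the two parts get independent weights (assumed names)."""
    return alpha * tckd + beta * nckd


kd, tckd, nckd, weight = decompose_kd([5.0, 2.0, 1.0, 0.0],
                                      [3.0, 2.5, 0.5, 0.0], target=0)
# Classical KD is the special case alpha = 1, beta = 1 - p_target(teacher).
print(abs(kd - dkd_loss(tckd, nckd, 1.0, weight)) < 1e-9)
```

When the teacher is confident about the target class, the coupling weight is small, so the NCKD term contributes little to the classical loss. Choosing `beta` independently removes that dependence.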

translated by Google Translate

ImageTBAD: A 3D Computed Tomography Angiography Image Dataset for Automatic Segmentation of Type-B Aortic Dissection

Zeyang Yao , Jiawei Zhang , Hailong Qiu , Tianchen Wang , Yiyu Shi , Jian Zhuang , Yuhao Dong , Meiping Huang , Xiaowei Xu

Categories: Computer Vision

2021-09-01

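Type-B aortic dissection (TBAD) is one of the most serious cardiovascular events, characterized by a growing yearly incidence and a severe disease prognosis. Currently, computed tomography angiography (CTA) is widely adopted for the diagnosis and prognosis of TBAD. Accurate segmentation of the true lumen (TL), false lumen (FL), and false lumen thrombus (FLT) in CTA is crucial for the precise quantification of anatomical features. However, existing works focus only on TL and FL without considering FLT. In this paper, we propose ImageTBAD, the first 3D computed tomography angiography (CTA) image dataset of TBAD with annotations of TL, FL, and FLT. The proposed dataset contains 100 TBAD CTA images, which is a decent size compared with existing medical imaging datasets. Because FLT can appear almost anywhere along the aorta with irregular shapes, its segmentation represents a wide class of segmentation problems in which targets occur at varied positions with irregular shapes. We further propose a baseline method for the automatic segmentation of TBAD. Results show that the baseline method achieves results comparable to existing works on aorta and TL segmentation. However, the segmentation accuracy for FLT is only 52%, which leaves large room for improvement and shows the challenge of our dataset. To facilitate further research on this challenging problem, our dataset and code will be released to the public.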

Protein-Ligand Complex Generator & Drug Screening via Tiered Tensor Transform

Jonathan P. Mailoa , Zhaofeng Ye , Jiezhong Qiu , Chang-Yu Hsieh , Shengyu Zhang

Categories: Neural and Evolutionary Computing

2023-01-03

Accurate determination of a small molecule candidate (ligand) binding pose in its target protein pocket is important for computer-aided drug discovery. Typical rigid-body docking methods ignore the pocket flexibility of protein, while the more accurate pose generation using molecular dynamics is hindered by slow protein dynamics. We develop a tiered tensor transform (3T) algorithm to rapidly generate diverse protein-ligand complex conformations for both pose and affinity estimation in drug screening, requiring neither machine learning training nor lengthy dynamics computation, while maintaining both coarse-grain-like coordinated protein dynamics and atomistic-level details of the complex pocket. The 3T conformation structures we generate are closer to experimental co-crystal structures than those generated by docking software, and more importantly achieve significantly higher accuracy in active ligand classification than traditional ensemble docking using hundreds of experimental protein conformations. 3T structure transformation is decoupled from the system physics, making future usage in other computational scientific domains possible.

Learning to Maximize Mutual Information for Dynamic Feature Selection

Ian Covert , Wei Qiu , Mingyu Lu , Nayoon Kim , Nathan White , Su-In Lee

Categories: Machine Learning | Machine Learning (Statistics)

2023-01-02

Feature selection helps reduce data acquisition costs in ML, but the standard approach is to train models with static feature subsets. Here, we consider the dynamic feature selection (DFS) problem where a model sequentially queries features based on the presently available information. DFS is often addressed with reinforcement learning (RL), but we explore a simpler approach of greedily selecting features based on their conditional mutual information. This method is theoretically appealing but requires oracle access to the data distribution, so we develop a learning approach based on amortized optimization. The proposed method is shown to recover the greedy policy when trained to optimality and outperforms numerous existing feature selection methods in our experiments, thus validating it as a simple but powerful approach for this problem.
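
The greedy policy mentioned above can be illustrated on a toy problem where the data distribution is fully known, which is the oracle setting the abstract says is normally unavailable. The sketch below is not the authors' amortized learning method. It only shows the greedy rule of querying the feature with the highest conditional mutual information with the label, given the values observed so far. The function names and the toy dataset (label equals `x0 AND x1`, with `x2` irrelevant) are assumptions made for this example.

```python
import itertools
import math
from collections import Counter


def mutual_information(pairs):
    """I(A; B) in nats for a list of equally weighted (a, b) samples."""
    n = len(pairs)
    joint = Counter(pairs)
    count_a = Counter(a for a, _ in pairs)
    count_b = Counter(b for _, b in pairs)
    return sum((c / n) * math.log(c * n / (count_a[a] * count_b[b]))
               for (a, b), c in joint.items())


def greedy_next_feature(rows, labels, observed):
    """Pick the unobserved feature with the largest I(x_i; y | observed).

    rows: list of feature tuples; labels: list of labels;
    observed: dict mapping feature index to its observed value.
    """
    # Conditioning on the observed values = keeping only the matching rows.
    keep = [k for k, row in enumerate(rows)
            if all(row[i] == v for i, v in observed.items())]
    candidates = [i for i in range(len(rows[0])) if i not in observed]
    scores = {i: mutual_information([(rows[k][i], labels[k]) for k in keep])
              for i in candidates}
    return max(scores, key=scores.get), scores


# Toy distribution: three binary features, uniform over all 8 combinations.
rows = list(itertools.product([0, 1], repeat=3))
labels = [x0 & x1 for x0, x1, _ in rows]

first, first_scores = greedy_next_feature(rows, labels, observed={})
second, second_scores = greedy_next_feature(rows, labels, observed={0: 1})
print(first, first_scores)
print(second, second_scores)
```

In this toy case the irrelevant feature `x2` scores zero at every step, and once `x0 = 1` has been observed, `x1` carries all remaining information about the label.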

Tracing the Origin of Adversarial Attack for Forensic Investigation and Deterrence

Han Fang , Jiyi Zhang , Yupeng Qiu , Ke Xu , Chengfang Fang , Ee-Chien Chang

Categories: Computer Vision | Machine Learning

2022-12-31

Deep neural networks are vulnerable to adversarial attacks. In this paper, we take the role of investigators who want to trace the attack and identify the source, that is, the particular model from which the adversarial examples are generated. The techniques derived would aid forensic investigation of attack incidents and serve as a deterrent to potential attacks. We consider the buyers-seller setting where a machine learning model is to be distributed to various buyers and each buyer receives a slightly different copy with the same functionality. A malicious buyer generates adversarial examples from a particular copy $\mathcal{M}_i$ and uses them to attack other copies. From these adversarial examples, the investigator wants to identify the source $\mathcal{M}_i$. To address this problem, we propose a two-stage separate-and-trace framework. The model separation stage generates multiple copies of a model for the same classification task. This process injects unique characteristics into each copy so that the adversarial examples generated have distinct and traceable features. We give a parallel structure which embeds a "tracer" in each copy, and a noise-sensitive training loss to achieve this goal. The tracing stage takes in adversarial examples and a few candidate models, and identifies the likely source. Based on the unique features induced by the noise-sensitive loss function, we can effectively trace the potential adversarial copy by considering the output logits from each tracer. Empirical results show that it is possible to trace the origin of the adversarial example and that the mechanism can be applied to a wide range of architectures and datasets.

Learning Spatiotemporal Frequency-Transformer for Low-Quality Video Super-Resolution

Zhongwei Qiu , Huan Yang , Jianlong Fu , Daochang Liu , Chang Xu , Dongmei Fu

Categories: Artificial Intelligence | Computer Vision

2022-12-27

Video Super-Resolution (VSR) aims to restore high-resolution (HR) videos from low-resolution (LR) videos. Existing VSR techniques usually recover HR frames by extracting pertinent textures from nearby frames with known degradation processes. Despite significant progress, grand challenges remain in effectively extracting and transmitting high-quality textures from highly degraded low-quality sequences affected by blur, additive noise, and compression artifacts. In this work, a novel Frequency-Transformer (FTVSR) is proposed for handling low-quality videos; it carries out self-attention in a combined space-time-frequency domain. First, video frames are split into patches and each patch is transformed into spectral maps in which each channel represents a frequency band. This permits fine-grained self-attention on each frequency band, so that real visual texture can be distinguished from artifacts. Second, a novel dual frequency attention (DFA) mechanism is proposed to capture global and local frequency relations, which can handle different complicated degradation processes in real-world scenarios. Third, we explore different self-attention schemes for video processing in the frequency domain and discover that a "divided attention", which conducts joint space-frequency attention before applying temporal-frequency attention, leads to the best video enhancement quality. Extensive experiments on three widely used VSR datasets show that FTVSR outperforms state-of-the-art methods on different low-quality videos with clear visual margins. Code and pre-trained models are available at https://github.com/researchmm/FTVSR.

A Novel Self-Supervised Learning-Based Anomaly Node Detection Method Based on an Autoencoder in Wireless Sensor Networks

Miao Ye , Qinghao Zhang , Xingsi Xue , Yong Wang , Qiuxiang Jiang , Hongbing Qiu

Categories: Machine Learning | Artificial Intelligence

2022-12-26

Because existing wireless sensor network (WSN)-based anomaly detection methods consider and analyze only temporal features, this paper designs a self-supervised learning-based anomaly node detection method based on an autoencoder. This method integrates temporal WSN data flow feature extraction, spatial position feature extraction and intermodal WSN correlation feature extraction into the design of the autoencoder to make full use of the spatial and temporal information of the WSN for anomaly detection. First, a fully connected network is used to extract the temporal features of nodes by considering a single mode from a local spatial perspective. Second, a graph neural network (GNN) is used to introduce the WSN topology from a global spatial perspective for anomaly detection and extract the spatial and temporal features of the data flows of nodes and their neighbors by considering a single mode. Then, the adaptive fusion method involving weighted summation is used to extract the relevant features between different models. In addition, this paper introduces a gated recurrent unit (GRU) to solve the long-term dependence problem of the time dimension. Finally, the reconstructed output of the decoder and the hidden layer representation of the autoencoder are fed into a fully connected network to calculate the anomaly probability of the current system. Since the spatial feature extraction operation is advanced, the designed method can be applied to the task of large-scale network anomaly detection by adding a clustering operation. Experiments show that the designed method outperforms the baselines, and the F1 score reaches 90.6%, which is 5.2% higher than that of the existing anomaly detection methods based on unsupervised reconstruction and prediction. Code and model are available at https://github.com/GuetYe/anomaly_detection/GLSL

EVM-CNN: Real-Time Contactless Heart Rate Estimation from Facial Video

Ying Qiu , Yang Liu , Juan Arteaga-Falconi , Haiwei Dong , Abdulmotaleb El Saddik

Categories: Computer Vision

2022-12-25

With the increase in health consciousness, noninvasive body monitoring has aroused interest among researchers. Heart rate (HR) is one of the most important pieces of physiological information, and researchers have estimated it remotely from facial videos in recent years. Although progress has been made over the past few years, there are still some limitations, such as processing time increasing with accuracy and the lack of comprehensive and challenging datasets for use and comparison. Recently, it was shown that HR information can be extracted from facial videos by spatial decomposition and temporal filtering. Inspired by this, a new framework is introduced in this paper to remotely estimate the HR under realistic conditions by combining spatial and temporal filtering and a convolutional neural network. Our proposed approach shows better performance compared with the benchmark on the MMSE-HR dataset in terms of both the average HR estimation and short-time HR estimation. High consistency in short-time HR estimation is observed between our method and the ground truth.

A Lightweight Reconstruction Network for Surface Defect Inspection

Chao Hu , Jian Yao , Weijie Wu , Weibin Qiu , Liqiang Zhu

Categories: Computer Vision | Machine Learning

2022-12-25

Currently, most deep learning methods cannot solve the problem of scarcity of industrial product defect samples and significant differences in characteristics. This paper proposes an unsupervised defect detection algorithm based on a reconstruction network, which is realized using only a large number of easily obtained defect-free sample data. The network includes two parts: image reconstruction and surface defect area detection. The reconstruction network is designed as a fully convolutional autoencoder with a lightweight structure. Only a small number of normal samples are used for training so that the reconstruction network can generate a defect-free reconstructed image. A function combining structural loss and $L_1$ loss is proposed as the loss function of the reconstruction network to solve the problem of poor detection of irregular texture surface defects. Further, the residual between the reconstructed image and the image to be tested is used as the possible region of the defect, and conventional image operations can then locate the fault. The unsupervised defect detection algorithm of the proposed reconstruction network is evaluated on multiple defect image sample sets. Compared with other similar algorithms, the results show that the unsupervised defect detection algorithm of the reconstruction network has strong robustness and accuracy.
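
The residual step described above can be sketched in a few lines. The abstract does not say which conventional image operations are used, so the fixed threshold and the bounding-box localization below are assumptions chosen for illustration. The function names and the toy 4x4 image are hypothetical as well. Images are plain nested lists of grayscale values in [0, 1].

```python
def defect_mask(test_image, reconstruction, threshold):
    """Mark pixels whose absolute residual exceeds the threshold with 1."""
    return [[1 if abs(t - r) > threshold else 0
             for t, r in zip(test_row, recon_row)]
            for test_row, recon_row in zip(test_image, reconstruction)]


def bounding_box(mask):
    """Return (top, left, bottom, right) of flagged pixels, or None if clean."""
    flagged = [(y, x) for y, row in enumerate(mask)
               for x, value in enumerate(row) if value]
    if not flagged:
        return None
    ys = [y for y, _ in flagged]
    xs = [x for _, x in flagged]
    return min(ys), min(xs), max(ys), max(xs)


# Toy example: a uniform surface with a bright two-pixel defect.
reconstruction = [[0.5] * 4 for _ in range(4)]   # stand-in for the autoencoder output
test_image = [row[:] for row in reconstruction]
test_image[1][2] = 0.90
test_image[2][2] = 0.95

mask = defect_mask(test_image, reconstruction, threshold=0.2)
print(bounding_box(mask))
```

A defect-free input produces an all-zero mask, so `bounding_box` returns `None` in that case.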

Mind Your Heart: Stealthy Backdoor Attack on Dynamic Deep Neural Network in Edge Computing

Tian Dong , Ziyuan Zhang , Han Qiu , Tianwei Zhang , Hewu Li , Terry Wang

Categories: Machine Learning

2022-12-22

Transforming off-the-shelf deep neural network (DNN) models into dynamic multi-exit architectures can achieve inference and transmission efficiency by fragmenting and distributing a large DNN model in edge computing scenarios (e.g., edge devices and cloud servers). In this paper, we propose a novel backdoor attack specifically on dynamic multi-exit DNN models. In particular, we inject a backdoor by poisoning the shallow hidden layers of a DNN model, targeting not this vanilla DNN model but only its dynamically deployed multi-exit architectures. Our backdoored vanilla model performs normally, and its backdoor cannot be activated even with the correct trigger. However, the backdoor will be activated when the victims acquire this model and transform it into a dynamic multi-exit architecture at deployment. We conduct extensive experiments to demonstrate the effectiveness of our attack on three structures (ResNet-56, VGG-16, and MobileNet) with four datasets (CIFAR-10, SVHN, GTSRB, and Tiny-ImageNet), and our backdoor is stealthy enough to evade multiple state-of-the-art backdoor detection or removal methods.
